-
Notifications
You must be signed in to change notification settings - Fork 580
feat: merge-train/avm #19282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
feat: merge-train/avm #19282
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This is a really good one! The bug is both a soundness and completeness one. ### Preamble We perform addition (mod 32-bits) in Sha256Compression in two key places: during `perform_rounds` and at the end (aka `last`) when we add the outputs of the rounds to the original input. We check that the correctness of the modulo operation by doing ` c = a + b; c = c_hi * 2**32 + c_lo` and we check that `c_hi < 2**32 && c_lo < 2**32`. `c_lo` is then the result of our addition. ### The bug Turns out the selectors that turn on the range check for the modulo addition of the outputs with the inputs was wrong; it was using `perform_rounds` instead of `last`. ### Soundess `perform_rounds` and `last` are mutually exclusive so actually we were not checking that the modulo addition of the outputs with the inputs were correct! ### Completeness This is how the fuzzer found it. During `perform_rounds` the lookups for the outputs were still active (although they shouldnt have been). The active lookups were operating on default values, basically checking that `0 < 2**32`. Now in most cases if you perform "small" modulo addition (i.e 1 + 2 = 3), then `c_hi = 0` (i.e. the upper bits are empty). Because of the nature of lookups (i.e. many-to-1 relationships) the buggy lookup was "passing" as they would be an entry of `0 < 2**32` in the `GT subtrace` from some **other** modulo add. This means that the fuzzer found this bug because it found an input such that **ALL** modulo addition operations performed during the 64 rounds of XOR, ROTs, etc resulted in an overflow into the upper bits! This meant that the buggy lookup would fail (since there was no free `0 < 2**32`). This manifests as a `RANGE_COMP_H_LHS` error in `check_circuit`. Additionally the fuzzer needed to generate a bytecode that had no OTHER operation that would have placed a trivial 0 < 2**32 entry in the GT trace
### Preamble
The representation of the point at infinity differs between BB and noir,
with the latter using `{x: 0, y: 0, inf: 1}`. To ensure we match with
noir's expectations we wrap the bb `AffinePoint` in a struct
`StandardAffinePoint` that outputs the noir representation if you ask
for the coordinates of a point at infinity.
### The bug
The ECADD opcode loads points from addresses in memory (i.e. x-coord,
y-coord, inf-flag). It is possible that some bytecode loads a point such
that `x != 0, y != 0, inf = 1`. When this happens during simulation, the
execution trace will load the **original** coordinate values but when
the EC gadget is called, it utilises `StandardAffinePoint` which
incorrectly outputs `{x: 0, y:0, inf = 1` when building the EC events.
This causes a `Lookup PERM_EXECUTION_DISPATCH_TO_ECC_ADD failed.`
because the value loaded in the registers in the execution trace no
longer match the values in the ECC subtrace.
### Solution
The subtlety in `StandardAffinePoint` is that we only want the point at
infinity to be `{x: 0, y: 0, inf: 1}`, when the result of an operation
results in infinity. Otherwise we want to use the original coordinates
of the point. This means we need to track the original coordinate values
as well in the `StandardAffinePoint`.
For situations where `radix = 0, 1` we were not building the precomputed trace for `radix_safe_limbs`. Even though the `ToRadix` subtrace will error (since the radix is invalid). We need this lookup in the execution trace to pass for gas computation. The error was missing by our existing test suite and the to_radix fuzzer becuase the bug only manifests when: - You have invalid radix (0 or 1) - You generate the actual precomputed trace (which skipped rows 0 and 1) - You generate the actual execution trace (which tries to lookup radix=1) - You run the actual lookup constraint check
~Forcing that backfilled slice memory was `DIRECT` does not guarantee it's valid since `generate_address_ref` could have laid out the memory for indirect or relative. Note: the "proper" fix is the to do this [issue](https://linear.app/aztec-labs/issue/AVM-170/backfilling-with-non-direct-addresses).~ This also contains a modification to the backfill of `to_radix`. We were always generating valid decompositions and not flexing the `truncation = true` flag.
ludamad
approved these changes
Jan 2, 2026
Collaborator
ludamad
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤖 Auto-approved
Collaborator
Author
|
🤖 Auto-merge enabled after 4 hours of inactivity. This PR will be merged automatically once all checks pass. |
### Internal Call Fuzzer Closes: [AVM-101](https://linear.app/aztec-labs/issue/AVM-101/fuzz-internal-call) One note - the fuzzer discovered that we can overflow `next_pc` and fail a lookup due to a mismatch: - In the execution trace, we increment the pc by the size of the instruction in the field, so this can go over max(uint32) - In the internal call stack trace, we gather `next_pc` from `context`, which is cast as a uint32 For now, I just `static_cast` in the execution trace to get the fuzzer running. The fuzzer also discovered a lookup which should be gated (fix included in this branch and in #18945) ### Notes/TODOs See comments for TODOs (should be easier reading next to relevant bits of code!) Coverage is nearly 100% with caveats: * Trace: Internal call stack trace is at 100% and it looks like all internal call/return related lines are hit in the execution trace * "Gadget": there is no gadget, but `generated/relations/internal_call_impl.hpp` is at 100% which is a good sign. The call _stack_ gadget (`internal_call_stack_manager.cpp`) is at around 70% annoyingly because I don't hit the full getter `get_current_call_stack()`. I can just add a couple of lines in the fuzzer to grab the stack and do some checks to hit this.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
BEGIN_COMMIT_OVERRIDE
fix(fuzzer): Fixes for backfilling (#19279)
fix(avm): sha256 modulo add bug (#19262)
fix(avm): preserve coordinates in embedded curve point (#19263)
fix(avm): precomp trace for get_p_limbs (#19266)
feat(avm): change to_radix backfill (#19277)
feat(avm): internal call 'gadget' fuzzer (#18927)
END_COMMIT_OVERRIDE